Large Data Sets, Conditional Entropy and the Cooper-Herskovitz Bayesian Score
Abstract
We examine the relationship between the Cooper-Herskovitz (CH) score of a Bayesian network and the conditional entropies of the nodes of the network conditioned on the probability distributions of their parents. We show that minimizing the conditional entropy of each node of the network conditioned on its set of parents amounts to maximizing the CH score. The main result is a lower bound on the size of the data set that ensures that the divergence between the conditional entropy and the Cooper-Herskovitz score is under a certain threshold.
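The link between the two quantities can be illustrated numerically. The sketch below is not taken from the paper: it assumes the standard closed form of the CH score with uniform parameter priors, whose per-node factor is the product over parent configurations j of (r-1)! / (N_ij + r-1)! times the product over child values k of N_ijk!, and compares its logarithm with the empirical conditional entropy. The helper names (`make_data`, `family_counts`, `log_ch_family`, `conditional_entropy`) and the synthetic two-variable data set are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter

def make_data(n):
    """Hypothetical synthetic data: rows (x0, x1) where x1 copies x0
    except on every tenth row, where it is flipped."""
    rows = []
    for i in range(n):
        x0 = i % 2
        x1 = x0 if i % 10 else 1 - x0
        rows.append((x0, x1))
    return rows

def family_counts(data, child, parents):
    """Return {parent configuration: Counter of child values}, i.e. the N_ijk."""
    counts = {}
    for row in data:
        cfg = tuple(row[p] for p in parents)
        counts.setdefault(cfg, Counter())[row[child]] += 1
    return counts

def log_ch_family(data, child, parents, arity):
    """Log of the CH factor of one node (uniform priors assumed):
    sum_j [ log (r-1)! - log (N_ij + r-1)! + sum_k log N_ijk! ].
    Unobserved parent configurations contribute log 1 = 0 and are skipped."""
    total = 0.0
    for ctr in family_counts(data, child, parents).values():
        n_ij = sum(ctr.values())
        total += math.lgamma(arity) - math.lgamma(n_ij + arity)
        total += sum(math.lgamma(n_ijk + 1) for n_ijk in ctr.values())
    return total

def conditional_entropy(data, child, parents):
    """Empirical conditional entropy H(child | parents), in nats."""
    n = len(data)
    h = 0.0
    for ctr in family_counts(data, child, parents).values():
        n_ij = sum(ctr.values())
        for n_ijk in ctr.values():
            h -= (n_ijk / n) * math.log(n_ijk / n_ij)
    return h

if __name__ == "__main__":
    for n in (100, 1000, 10000):
        data = make_data(n)
        h = conditional_entropy(data, child=1, parents=[0])
        per_sample = -log_ch_family(data, child=1, parents=[0], arity=2) / n
        print(n, round(h, 4), round(per_sample, 4))
```

On this toy data the parent set with the lower conditional entropy also has the higher log CH factor, and the gap between the entropy and the negative log score per sample narrows as the data set grows, which is the kind of behavior the lower bound in the abstract quantifies.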
Similar resources
Evaluating Bayesian Networks by Sampling with Simplified Assumptions
The most common fitness evaluation for Bayesian networks in the presence of data is the Cooper-Herskovitz criterion. This technique involves massive amounts of data and, therefore, expensive computations. We propose a cheaper alternative evaluation method using simplified assumptions, which produces evaluations that are strongly correlated with the Cooper-Herskovitz criterion.
Properties of Weak Conditional Independence
Object-oriented Bayesian networks (OOBNs) facilitate the design of large Bayesian networks by allowing Bayesian networks to be nested inside of one another. Weak conditional independence has been shown to be a necessary and sufficient condition for ensuring consistency in OOBNs. Since weak conditional independence plays such an important role in OOBNs, in this paper we establish two useful resu...
A Preferred Definition of Conditional Rényi Entropy
The Rényi entropy is a generalization of Shannon entropy to a one-parameter family of entropies. Tsallis entropy is also a generalization of Shannon entropy, but its measure is non-logarithmic. After the introduction of Shannon entropy, the conditional Shannon entropy was derived and its properties became known. Also, for Tsallis entropy, the conditional entropy was introduced a...
How To Use catnet Package
The catnet package implements a categorical Bayesian network framework in R. Bayesian networks are graphical statistical models that represent directed dependencies between random variables and thus are able to model causal relationships among these variables. A Bayesian network has two components: a Directed Acyclic Graph (DAG) whose nodes are the variables of interest, and a probability structure given...
Structure Inference of Bayesian Networks from Data: A New Approach Based on Generalized Conditional Entropy
We propose a novel algorithm for extracting the structure of a Bayesian network from a dataset. Our approach is based on generalized conditional entropies, a parametric family of entropies that extends the usual Shannon conditional entropy. Our results indicate that with an appropriate choice of a generalized conditional entropy we obtain Bayesian networks that have superior scores compared to ...